Introduction and datasets

Column

Introduction

For my corpus, I will use two of the playlists that Spotify made for me. The first playlist is “Jouw topnummers van 2020” (“Your top songs of 2020”) and the second is “Jouw topnummers van 2021” (“Your top songs of 2021”). What I find interesting about these playlists is that they are in some way representative of the music that I listened to in 2020 and 2021. I’m interested in seeing whether specific things have changed in my music taste between those two years.

When comparing two tracks, it seemed most logical to me to compare the number one songs from both years. For 2020 that song is ‘Why Why Why Why Why’ by Sault. For 2021 that song is ‘I Know You, I Live You’ by Chaka Khan.

Some features that I will look into include valence, energy, tempo, key and mode. I am not sure what my general preference is when it comes to these features, so I am interested to see if there are some things that stand out.

On the side, you can see a small part of the tables that contain the information about my top tracks from 2020 and 2021. The full table consists of 100 tracks per playlist, with 60 columns containing information about these tracks.


The 2020 playlist will be represented by the color red in some of the graphs, while the 2021 playlist will be represented by the color blue.

Typical and Atypical tracks

Some typical tracks from the 2020 playlist:

Why Why Why Why Why - Sault

Colors - Black Pumas

Exit music (for a film) - Radiohead

Blue World - Mac Miller

H.f.g.w (Canyons Drunken Rage) - Tame Impala

Some atypical tracks from the 2020 playlist:

Fam - sor

Daisy - Ashnikko

Fragments of stasimon of Orestes by Euripides - Petros Tabouris

Some typical tracks from the 2021 playlist:

I know you, I live you - Chaka Khan

You Don’t Listen - General Elektriks

Famous - The Internet

Blackstar - David Bowie

Exit music (for a film) - Radiohead

Some atypical tracks from the 2021 playlist:

Temporary - Lauren Jauregui

SHUM - Go_A

Symphony No.5: IV. Adagietto. Sehr Langsam - Gustav Mahler

Column

Table 2020

key loudness mode speechiness acousticness instrumentalness
9 -6.640 0 0.0357 0.338000 6.16e-02
0 -5.500 1 0.0307 0.000566 1.01e-05
3 -8.269 0 0.0369 0.108000 1.56e-01
2 -7.738 1 0.1530 0.478000 2.81e-03
9 -8.810 1 0.0496 0.009000 8.22e-04
1 -8.219 1 0.0286 0.533000 2.50e-01
4 -7.571 0 0.0620 0.007690 0.00e+00
10 -8.679 1 0.3340 0.118000 0.00e+00
11 -5.704 1 0.0332 0.047900 5.98e-03
7 -8.922 0 0.0576 0.670000 4.29e-01

Table 2021

key loudness mode speechiness acousticness instrumentalness
7 -9.506 0 0.0417 0.0722 1.51e-02
1 -7.288 1 0.0419 0.3920 4.42e-02
5 -9.563 0 0.0288 0.3200 9.72e-04
9 -5.724 1 0.0413 0.0328 1.34e-01
2 -6.636 1 0.0431 0.0717 7.03e-03
4 -7.737 1 0.0576 0.2100 0.00e+00
4 -6.005 0 0.1460 0.1770 1.18e-04
10 -12.529 1 0.0330 0.7580 2.67e-04
1 -9.473 1 0.0587 0.8080 1.64e-05
7 -3.727 1 0.0439 0.0492 5.17e-03

Energy

Column

Summary


2020:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0440  0.4387  0.5535  0.5581  0.6700  0.9310 

2021:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.00639 0.44575 0.58200 0.57077 0.71400 0.97600 

In the histogram, you can see the energy count per playlist. The included summary shows that the median and the mean are not far apart when you compare the 2020 and 2021 playlists. There is, however, some difference in the minimum and maximum energy: the 2021 playlist runs from about 0.006 to 0.976, while the 2020 playlist runs from 0.044 to 0.931. In my opinion this would mean that in 2021 I listened to a wider variety of music when it comes to energy.

The second chart shows that the two density curves differ in shape. This could be explained by the 2020 minimum and maximum being less far apart, which concentrates the 2020 tracks in a narrower range of energy values.
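The summary above can be reproduced with a few lines of code. The sketch below is only an illustration: the energy values are hypothetical (the full energy column is not shown here), and the function name `six_number_summary` is my own. It uses the "inclusive" quantile method, which to my knowledge corresponds to the default interpolation behind R's `summary()`.

```python
from statistics import mean, quantiles


def six_number_summary(values):
    """Return min, Q1, median, mean, Q3 and max of a list of numbers."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": med,
        "mean": mean(values),
        "q3": q3,
        "max": max(values),
    }


# Hypothetical energy values, NOT the real playlist data.
energy_a = [0.20, 0.45, 0.55, 0.60, 0.90]
energy_b = [0.05, 0.45, 0.55, 0.60, 0.95]

summary_a = six_number_summary(energy_a)
summary_b = six_number_summary(energy_b)

# The range (max - min) is a simple measure of how wide the variety is.
range_a = summary_a["max"] - summary_a["min"]
range_b = summary_b["max"] - summary_b["min"]

print(summary_a)
print(summary_b)
print("wider range:", "b" if range_b > range_a else "a")
```

In this toy data both lists share the same quartiles and median, yet the second list has the wider range, which is the same pattern as in the two playlists.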

Column

Energy Count

Energy per year 2

Energy-Valence

Column


Correlation 2020 - Energy / Valence
[1] 0.455756

Correlation 2021 - Energy / Valence
[1] 0.2139411

Energy valence

The first graph shows the relation between energy and valence for both 2020 and 2021. I also calculated the correlation between these two variables. That calculation suggests a stronger correlation between energy and valence in the 2020 playlist (0.46) than in the 2021 playlist (0.21). This is visually confirmed by the second chart, which shows the mode (minor or major) in addition to energy and valence. The values seem to be more scattered in the 2021 graph.

Something I found interesting about this graph is that in the 2020 chart, minor songs generally score higher on energy than major songs. A possible explanation is that the ten songs with the highest energy are all (hard) rock songs. These songs are often quite high in energy, even when they are in a minor key.

Another thing I found interesting about this graph is that the song with both the lowest energy and the lowest valence in the 2021 chart is actually a song that does not really belong in the playlist. Last year I took a Musicological History course, for which we had to take a listening test to prove that we were able to recognize a piece by hearing it. One of the pieces that I struggled with while studying was Symphony No. 5: IV. Adagietto. Sehr langsam by Gustav Mahler. This is the track with the lowest energy and valence. It is therefore not really a representative track, since I didn’t listen to it because I wanted to, but because I had to.
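To illustrate how a single unrepresentative track can move a correlation coefficient, here is a minimal sketch of the Pearson correlation, computed once with and once without an extreme track. The (energy, valence) pairs are hypothetical, and the last pair only mimics a very quiet classical piece; this does not reproduce the values reported for the playlists.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (energy, valence) pairs, NOT the real playlist data.
tracks = [
    (0.55, 0.60),
    (0.70, 0.35),
    (0.45, 0.50),
    (0.80, 0.55),
    (0.60, 0.30),
    (0.01, 0.04),  # stand-in for one very low-energy, low-valence track
]

energy = [e for e, _ in tracks]
valence = [v for _, v in tracks]

r_all = pearson(energy, valence)
r_without_extreme = pearson(energy[:-1], valence[:-1])

print("with extreme track:   ", round(r_all, 3))
print("without extreme track:", round(r_without_extreme, 3))
```

In this toy example the extreme track alone creates most of the positive correlation. With 100 tracks the effect of one track is smaller, but it is a reason to treat the coefficient with some care.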

Column

Energy Valence

Energy Valence Mode

Valence Boxplot

Danceability

Column 1

Danceability


Danceability says something about how suitable a track is for dancing. It is based on multiple musical elements, including energy, tempo, rhythm stability and beat strength. The 2020 playlist has a higher median and a higher mean for danceability. What I find interesting about this is that the 2021 playlist had a higher mean energy. There was barely any difference in average tempo between the two playlists, with the 2021 playlist scoring 1.5 BPM higher.

When I look at the ten songs with the highest danceability in both playlists, I don’t see any differences that would logically explain why the 2020 playlist has a higher mean danceability. When I look at the ten songs with the lowest danceability, it makes more sense that the 2021 playlist scored lower. That list contains two pieces from a classical music playlist that I had to listen to for a school assignment. Besides that, I listened to more rock songs in 2021, and a few of those songs apparently score really low on danceability.
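A rough check of how much the two assignment pieces pull down the 2021 mean can be done with the rounded numbers from the summary and the lowest-danceability table in this section. The sketch below assumes that the two pieces are the Mahler Adagietto (0.0618) and the Sonata movement (0.2760); that identification is my assumption, and because the mean is rounded the result is approximate.

```python
def mean_without(mean_all, n_all, removed):
    """Mean of what remains after removing `removed` from n_all values.

    mean_all is the mean of all n_all values; removed is a list of the
    values that are taken out.
    """
    total_remaining = mean_all * n_all - sum(removed)
    return total_remaining / (n_all - len(removed))


mean_2020 = 0.6737  # rounded mean danceability 2020 (from the summary)
mean_2021 = 0.6224  # rounded mean danceability 2021 (from the summary)

# Assumed to be the two classical assignment pieces.
assumed_assignment_tracks = [0.0618, 0.2760]

adjusted_2021 = mean_without(mean_2021, 100, assumed_assignment_tracks)

print("2021 mean without the two pieces:", round(adjusted_2021, 4))
print("still below 2020:", adjusted_2021 < mean_2020)
```

Under these assumptions the 2021 mean rises only to about 0.63, so the two pieces explain a small part of the gap with 2020 at most.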

Column 2

Summary


2020
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2930  0.5745  0.6835  0.6737  0.7833  0.9460 

2021
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0618  0.5102  0.6410  0.6224  0.7455  0.9490 

Lowest Danceability


2020
danceability track.name
0.293 Exit Music (For a Film)
0.303 Feels Like We Only Go Backwards
0.360 Karma Police
0.399 Fragment of stasimon of Orestes by Euripides - Ancient manuscript
0.418 High And Dry
0.430 Why Don’t You
0.444 my future
0.447 Sorry Ain’t Enough
0.452 Freakin’ Out On the Interstate
0.462 Red Eyes

2021
danceability track.name
0.0618 Symphony No. 5: IV. Adagietto. Sehr langsam
0.2010 Electioneering
0.2760 Sonata I.X.: I. Foreboding
0.2930 Exit Music (For A Film)
0.3190 Blouse
0.3320 Yesterday - Remastered 2009
0.3420 Bodysnatchers
0.3500 Bury Me
0.3660 Blackstar
0.3660 Knights of Cydonia

Column 3

Distribution of Danceability Data.

Tempo features

Column


An observation that I made is that the song with the highest tempo can actually be heard in half-time. It is the song ‘Parachutes’ by Jordan Mackampa. In my opinion the song is in half-time, which would mean that its tempo is not 193 BPM but about 97 BPM.

The tempo is the highest for songs with a 4/4 time signature.

Even though the 2021 playlist has fewer outliers, it has a wider range between the lower and the upper fence. This can be seen in the boxplot.

When I look at the songs that fall above the upper fence for the 2020 playlist, I can see that they have all been assigned to the wrong tempo octave. None of the five highest-tempo songs from that playlist actually has a tempo above 158 BPM, the upper fence.
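The idea of the upper fence and the tempo-octave correction can be sketched as follows. The tempo list is hypothetical apart from the 193 BPM value, the fence uses the usual 1.5 × IQR rule (which I assume is what the boxplot uses), and halving every tempo above the fence is a simplification: a song that really is that fast would be halved incorrectly.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the lower and upper fence, Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def fold_tempo_octave(tempo, upper_fence):
    """Halve a tempo above the upper fence, leave other tempos unchanged."""
    return tempo / 2 if tempo > upper_fence else tempo


# Hypothetical tempos in BPM; only 193.0 echoes a value from the text.
tempos = [80.0, 92.0, 100.0, 108.0, 114.0, 120.0, 126.0, 140.0, 193.0]

lower, upper = tukey_fences(tempos)
folded = [fold_tempo_octave(t, upper) for t in tempos]

print("fences:", lower, upper)
print("folded:", folded)
```

For this toy list only the 193 BPM track lies above the fence, and it is folded down to 96.5 BPM.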


2020
# A tibble: 1 × 3
  minTempo meanTempo maxTempo
     <dbl>     <dbl>    <dbl>
1     54.6      113.     194.

2021
# A tibble: 1 × 3
  minTempo meanTempo maxTempo
     <dbl>     <dbl>    <dbl>
1     66.0      114.     180.

SD Tempo

Tempo Boxplot

SD Tempo

Time signature - tempo - mode

Tempograms

Text

Both of the tempograms show problems with placing the songs in the right tempo octave. The 2021 tempogram is even more flawed than the 2020 tempogram, since it barely shows any difference between 113 BPM and the other tempos.


playlist_name track.name tempo
Your Top Songs 2020 Why Why Why Why Why 114.762
playlist_name track.name tempo
JTNV2021 I Know You, I Live You 113.229

Tempograms

2020

2021

Keys and chords

Column


The most used key in 2021 is C# major. The most used key in 2020 is A minor. The 2021 playlist contains 53 songs in a major key and 47 in a minor key. The 2020 playlist contains 44 songs in a major key and 56 in a minor key.
This fits with the mean valence, energy and tempo, which are all slightly higher for the 2021 playlist. Together this suggests that the 2021 playlist was a bit ‘happier’ than the 2020 playlist.
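The key names and mode counts come from two numeric columns. As a small sketch, the code below converts the `key` and `mode` values from the ten-row excerpt of the 2020 table into key names and counts the modes. It assumes Spotify's encoding (key as pitch class with 0 = C, mode 1 = major and 0 = minor) and ignores the special value -1 for "no key detected". Since it only uses ten rows, the counts do not match the full playlist.

```python
from collections import Counter

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]


def key_name(key, mode):
    """Turn a pitch class (0-11) and a mode (1 major, 0 minor) into a name."""
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


# key and mode columns of the ten-row excerpt of the 2020 table.
keys = [9, 0, 3, 2, 9, 1, 4, 10, 11, 7]
modes = [0, 1, 0, 1, 1, 1, 0, 1, 1, 0]

names = [key_name(k, m) for k, m in zip(keys, modes)]
mode_counts = Counter("major" if m == 1 else "minor" for m in modes)

print(names)
print(mode_counts)
```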


Mean Valence 2020
[1] 0.485539
Mean Valence 2021
[1] 0.503828

Column

Key name histogram

Key minor major

Mode

Loudness

Graphs

Boxplot

Table Lowest Loudness 2020

loudness track.name
-20.045 Epitaph of Seikilos
-14.903 I Loved Another Woman
-14.602 40 Nights
-14.002 Sigurt - Remix
-13.418 Sorry Ain’t Enough
-13.013 Te extraño, pero…
-12.724 Fragment of stasimon of Orestes by Euripides - Ancient manuscript
-12.706 Don’t Let Get You Down - Edit
-12.304 Goodie Bag
-11.943 Territory

Table Lowest Loudness 2021

loudness track.name
-34.643 Symphony No. 5: IV. Adagietto. Sehr langsam
-24.664 Delicate Felt
-20.638 Blouse
-20.428 Sonata I.X.: I. Foreboding
-19.132 Maria Elena
-18.204 Hold On - Remastered 2010
-16.223 Hazey - Stripped
-15.351 Why Try to Change Me Now
-15.297 Sunshine
-13.924 Day Dreaming - 2021 Remaster

Text

Explanation

The means and medians for both playlists are quite similar. However, the 2021 playlist shows a broader distribution of the quantiles. The 2021 playlist also shows more outliers. The outlier at about -35 dB is a classical piece that I had to listen to for school. Several of the outliers are classical music, which would explain why they have a lower loudness value.

Mean

Mean Loudness 2020
[1] -8.54666
Mean Loudness 2021
[1] -8.78596

Speechiness and Acousticness

Means

Explanation

As you can see, the mean for acousticness is higher for the 2021 playlist and the mean for speechiness is higher for the 2020 playlist. The ‘outlier’ that can be seen in the histogram for the 2020 playlist is a rap song called D/Vision by JID. The fact that it is a rap song explains the high level of speechiness. Most of the tracks with high acousticness are classical pieces, some of which are from a classical playlist that I had to study for a school assignment.

Something that is interesting to me is that the 2021 playlist shows a negative correlation between acousticness and speechiness (-0.08), while the 2020 playlist shows a slightly positive one (0.02). Both values are close to zero, so in neither playlist is there a clear linear relation between the two features.

Correlation 2020, 2021 - Acousticness, Speechiness
[1] 0.02179429
[1] -0.08261927

Means

mean speechiness 2020
[1] 0.09523
mean speechiness 2021
[1] 0.081393
mean acousticness 2020
[1] 0.2526544
mean acousticness 2021
[1] 0.2772244

Graphs

Histogram Speechiness

Histogram Acousticness

Acousticness - Speechiness

Chroma Features

Top song of 2020, SAULT - whywhywhywhywhy

Top song of 2020, SAULT - whywhywhywhywhy, Pitch

Timbre

Top song of 2021, Chaka Khan - I know you, I live you

Top song of 2021, Chaka Khan - I know you, I live you, Pitch

Timbre

Text


Pitch

Why Why Why Why Why: During the song, D is the most used pitch. Other common pitches are F and A. This could be explained by the fact that a D-minor chord consists of D-F-A. Another pitch that seems to have a high magnitude is Db. An explanation could be that the song often features the raised 7th tone (C#, which is the same pitch class as Db), but I think it is more likely that the 7th tone here is C, which also shows a relatively high magnitude for some parts of the song. In my opinion it is more likely that some of the D was read by the algorithm as a Db.

I Know You, I Live You: The pitches that have the highest magnitude are C, Db, D, G and A. According to the Spotify analysis, the key of this song is G minor. This would explain the high magnitude for the G and D pitches. The dominant of G minor is built on D, which would explain the high magnitude for the A tone, since the D-minor chord consists of D-F-A.

Timbre: For both the 2020 and the 2021 song, the lower coefficients capture more of the timbre information. ‘Why Why Why Why Why’ shows a high magnitude for c01, c02 and c03. ‘I Know You, I Live You’ mainly shows a high magnitude for c02.
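The chord reasoning in the two pitch paragraphs can be checked with a little pitch-class arithmetic. The sketch below builds a minor triad as root, minor third (3 semitones up) and perfect fifth (7 semitones up); the function name `minor_triad` is only chosen for this illustration.

```python
PITCH_CLASSES = ["C", "Db", "D", "Eb", "E", "F",
                 "Gb", "G", "Ab", "A", "Bb", "B"]


def minor_triad(root):
    """Return the three pitch classes of the minor triad on `root`."""
    start = PITCH_CLASSES.index(root)
    # A minor triad stacks a minor third (3) and a perfect fifth (7).
    return [PITCH_CLASSES[(start + step) % 12] for step in (0, 3, 7)]


d_minor = minor_triad("D")
g_minor = minor_triad("G")
shared = set(d_minor) & set(g_minor)

print("D minor:", d_minor)
print("G minor:", g_minor)
print("shared pitch classes:", shared)
```

The two triads share only D. The A is part of the D-minor triad but not of the G-minor triad, which is consistent with explaining the A through the chord built on D.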

Self Similarity

Column

Self similarity 2020

Self similarity 2021

Explanation

The structure of the 2020 top track, ‘Why Why Why Why Why’, can be seen as an A-B-C-B-C-B structure. The instrumental parts of the song are often found at the beginning of a new letter, for example the darker blue blocks at duration 0-16 (the intro) and 59-65. This is a song that features a lot of variation between the different verses and choruses. I think that explains why the matrix is not the cleanest.

The structure of the 2021 top track, ‘I Know You, I Live You’ can be seen as an A-B-A-B-B structure.
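To show how a repeated section shows up in a self-similarity matrix, here is a minimal sketch that compares every segment of a song with every other segment. It uses cosine similarity as the measure, which is an assumption (the matrices in this portfolio may use a different distance), and the feature vectors for the A and B sections are hypothetical.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def self_similarity(segments):
    """Square matrix with the similarity of every segment to every other."""
    return [[cosine_similarity(u, v) for v in segments] for u in segments]


# Hypothetical feature vectors for two kinds of section.
section_a = [0.9, 0.1, 0.2]
section_b = [0.1, 0.8, 0.3]

# An A-B-A-B-B layout, as described for the 2021 top track.
segments = [section_a, section_b, section_a, section_b, section_b]
matrix = self_similarity(segments)

for row in matrix:
    print([round(value, 2) for value in row])
```

Repeated sections give values close to 1 away from the main diagonal, which is what appears as parallel diagonals or matching blocks in the plotted matrix.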

Clustering

Text


Some songs that are linked in the 2020 dendrogram include ‘She Can’t Love You’ and ‘Mad At Me’. These songs are both R&B songs. ‘Back Pocket’ and ‘Lockdown’ are both songs in which the bass line plays an important role. They also have a similar timbre coefficient. The top song for 2020, ‘Why Why Why Why Why’, is clustered with ‘Breeze’, which is the 18th song on that playlist. They have similar values for speechiness, timbre coefficient c02 and liveness.


Some songs that are linked in the 2021 dendrogram include ‘Krunk’ and ‘Little Lady’. These songs being linked surprised me, because they don’t sound very similar to me. When looking at the heat map, I can see that they are actually quite similar if you look at speechiness, loudness and key. These are some features that I don’t necessarily notice consciously when I listen to music, but that are clearly still important for clustering music. Another cluster that surprised me is ‘You Oughta Know’ and ‘Famous’, one being a rock song, while the other is R&B. They are similar when you look at the timbre features, having very similar values for multiple timbre coefficients. They also have similar values for instrumentalness and danceability.
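Why two songs that sound different can still end up next to each other is easier to see in a small example. In agglomerative clustering, the first link is always made between the two tracks that lie closest together in feature space. The sketch below standardises each feature and finds that closest pair using Euclidean distance. The track names and values are hypothetical, and the distance and scaling are assumptions, not necessarily the settings used for the dendrograms here.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def standardise(rows):
    """Scale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def closest_pair(names, rows):
    """Return the names of the two tracks with the smallest distance."""
    scaled = standardise(rows)
    pairs = combinations(range(len(names)), 2)
    i, j = min(pairs, key=lambda p: dist(scaled[p[0]], scaled[p[1]]))
    return names[i], names[j]


# Hypothetical tracks with (speechiness, loudness, danceability).
features = {
    "Track A": [0.05, -6.0, 0.70],
    "Track B": [0.06, -6.5, 0.72],
    "Track C": [0.30, -12.0, 0.40],
    "Track D": [0.04, -9.0, 0.55],
}

first_link = closest_pair(list(features), list(features.values()))
print("first link in the dendrogram:", first_link)
```

The pair is chosen purely on the listed features, so two tracks from different genres are linked as soon as their values on those features are close.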

Heatmaps

Heat map 2020

Heat map 2021

Conclusion

Some of the features that I analysed for this portfolio show high similarity between the 2020 playlist and the 2021 playlist. Both playlists gave similar energy histograms, both used 4/4 most often as time signature, and they had almost the same mean and median tempo. Other features like mean acousticness and median loudness were also almost the same for both playlists.

There were also some features that were less similar. The 2021 playlist showed a broader width for the valence quantiles. The songs were also more scattered around the 2021 valence-energy-mode graph than in the 2020 graph. In other words, the 2020 playlist showed more correlation between energy and valence.

Both of the top songs in these playlists showed a high magnitude for the pitches D and Db, which is interesting to me because D and Db were the second and third most used keys for 2021, but they were used far less in 2020.

And then there were the cases of features that didn’t show a lot of similarity, but that were in my opinion influenced by songs that didn’t belong in my top tracks. For instance, the danceability data was influenced by a couple of tracks that ended up in my top 100 because I had to study them for a school assignment. These same songs also influenced the loudness and speechiness features. There were also some problems with analysing the tempo features, since some of the outliers were in my opinion in the wrong tempo octave.